171 research outputs found

    An Update on MyoD Evolution in Teleosts and a Proposed Consensus Nomenclature to Accommodate the Tetraploidization of Different Vertebrate Genomes

    DJM was supported by a Natural Environment Research Council studentship (NERC/S/A/2004/12435). Background: MyoD is a muscle-specific transcription factor that is essential for vertebrate myogenesis. In several teleost species, including representatives of the Salmonidae and Acanthopterygii, but not zebrafish, two or more MyoD paralogues are conserved that are thought to have arisen from distinct, possibly lineage-specific duplication events. Additionally, two MyoD paralogues have been characterised in the allotetraploid frog, Xenopus laevis. This has led to a confusing nomenclature, since MyoD paralogues have been named outside of an appropriate phylogenetic framework. Methods and Principal Findings: Here we initially show that directly depicting the evolutionary relationships of teleost MyoD orthologues and paralogues is hindered by the asymmetric evolutionary rate of Acanthopterygian MyoD2 relative to other MyoD proteins. Thus our aim was to confidently position the event from which teleost paralogues arose in different lineages by a comparative investigation of genes neighbouring myod across the vertebrates. To this end, we show that genes on the single myod-containing chromosome of mammals and birds are retained in both zebrafish and Acanthopterygian teleosts in a striking pattern of double conserved synteny. Further, phylogenetic reconstruction of these neighbouring genes using Bayesian and maximum likelihood methods supported a common origin for teleost paralogues following the split of the Actinopterygii and Sarcopterygii. Conclusion: Our results strongly suggest that myod was duplicated during the basal teleost whole genome duplication event, but was subsequently lost in the Ostariophysi (zebrafish) and Protacanthopterygii lineages. We propose a sensible consensus nomenclature for vertebrate myod genes that accommodates polyploidization events in teleost and tetrapod lineages and is justified from a phylogenetic perspective. Publisher PDF. Peer reviewed.

    A generic testing framework for agent-based simulation models

    Agent-based modelling and simulation (ABMS) has received increasing attention during the last decade. However, the weak validation and verification of agent-based simulation models make ABMS hard to trust. There is no comprehensive tool set for verification and validation of agent-based simulation models that demonstrates that inaccuracies exist and/or reveals the existing errors in the model. Moreover, on the practical side, many ABMS frameworks are in use. We therefore designed and developed a generic testing framework for agent-based simulation models to conduct validation and verification of models. This paper presents our testing framework in detail and demonstrates its effectiveness by showing its applicability to a realistic agent-based simulation case study.

    Analysis of multiplex gene expression maps obtained by voxelation

    Background: Gene expression signatures in the mammalian brain hold the key to understanding neural development and neurological disease. Researchers have previously used voxelation in combination with microarrays to acquire genome-wide atlases of expression patterns in the mouse brain. On the other hand, some work has studied gene functions without taking into account where in the mouse brain a gene is expressed. In this paper, we present an approach for identifying the relation between gene expression maps obtained by voxelation and gene functions. Results: To analyze the dataset, we chose typical genes as queries and aimed at discovering similar gene groups. Gene similarity was determined by using wavelet features extracted from gene expression maps averaged over the left and right hemispheres, and by the Euclidean distance between each pair of feature vectors. We also performed a multiple clustering approach on the gene expression maps, combined with hierarchical clustering. Within each group of similar genes and each cluster, gene function similarity was measured by calculating the average gene function distances in the gene ontology structure. By applying our methodology to find genes similar to certain target genes, we were able to improve our understanding of gene expression patterns and gene functions. By applying the clustering analysis method, we obtained significant clusters that have both very similar gene expression maps and very similar gene functions with respect to their corresponding gene ontologies. The cellular component ontology resulted in prominent clusters expressed in cortex and corpus callosum. The molecular function ontology gave prominent clusters in cortex, corpus callosum and hypothalamus. The biological process ontology resulted in clusters in cortex, hypothalamus and choroid plexus. Clusters from all three ontologies combined were most prominently expressed in cortex and corpus callosum. Conclusion: The experimental results support the hypothesis that genes with similar gene expression maps might have similar gene functions. The voxelation data take into account the location of gene expression in the mouse brain, which is novel in related research. The proposed approach can potentially be used to predict gene functions and provide helpful suggestions to biologists.
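The similarity query described in the voxelation abstract (rank genes by the Euclidean distance between their feature vectors) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the gene names and the three-element feature vectors are hypothetical placeholders, and the wavelet feature extraction step is assumed to have already happened.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def most_similar(query, features, k=2):
    """Return the k genes whose feature vectors lie closest to the query gene."""
    candidates = [gene for gene in features if gene != query]
    candidates.sort(key=lambda gene: euclidean(features[query], features[gene]))
    return candidates[:k]


# Hypothetical feature vectors (stand-ins for wavelet features of expression maps).
features = {
    "geneA": [1.0, 0.0, 2.0],
    "geneB": [1.1, 0.1, 2.0],
    "geneC": [5.0, 5.0, 5.0],
    "geneD": [0.9, 0.0, 2.2],
}

nearest = most_similar("geneA", features, k=2)
```

With these placeholder vectors, `nearest` holds the two genes closest to `geneA`; a real analysis would use far longer feature vectors and then compare the functional annotations of the retrieved genes.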

    Structuring heterogeneous biological information using fuzzy clustering of k-partite graphs

    Background: Extensive and automated data integration in bioinformatics facilitates the construction of large, complex biological networks. However, the challenge lies in the interpretation of these networks. While most research focuses on the unipartite or bipartite case, we address the more general but common situation of k-partite graphs. These graphs contain k different node types and links are only allowed between nodes of different types. In order to reveal their structural organization and describe the contained information in a more coarse-grained fashion, we ask how to detect clusters within each node type. Results: Since entities in biological networks regularly have more than one function and hence participate in more than one cluster, we developed a k-partite graph partitioning algorithm that allows for overlapping (fuzzy) clusters. It determines for each node a degree of membership to each cluster. Moreover, the algorithm estimates a weighted k-partite graph that connects the extracted clusters. Our method is fast and efficient, mimicking the multiplicative update rules commonly employed in algorithms for non-negative matrix factorization. It facilitates the decomposition of networks on a chosen scale and therefore allows for analysis and interpretation of structures on various resolution levels. Applying our algorithm to a tripartite disease-gene-protein complex network, we were able to structure this graph on a large scale into clusters that are functionally correlated and biologically meaningful. Locally, smaller clusters enabled reclassification or annotation of the clusters' elements. We exemplified this for the transcription factor MECP2. Conclusions: In order to cope with the overwhelming amount of information available from biomedical literature, we need to tackle the challenge of finding structures in large networks with nodes of multiple types. To this end, we presented a novel fuzzy k-partite graph partitioning algorithm that allows the decomposition of these objects in a comprehensive fashion. We validated our approach both on artificial and real-world data. It is readily applicable to any further problem.

    Building cooperation through health initiatives: an Arab and Israeli case study

    Background: Ongoing conflict in the Middle East poses a major threat to health and security. A project screening Arab and Israeli newborns for hearing loss provided an opportunity to evaluate ways for building cooperation. The aims of this study were to: a) examine what attracted Israeli, Jordanian and Palestinian participants to the project, b) describe challenges they faced, and c) draw lessons learned for guiding cross-border health initiatives. Methods: A case study method was used involving 12 key informants stratified by country (3 Israeli, 3 Jordanian, 3 Palestinian, 3 Canadian). In-depth interviews were tape-recorded, transcribed and analyzed using an inductive qualitative approach to derive key themes. Results: Major reasons for getting involved included: concern over an important health problem, curiosity about neighbors and opportunities for professional advancement. Participants were attracted to prospects for opening the dialogue, building relationships and facilitating cooperation in the region. The political situation was a major challenge that delayed implementation of the project and placed participants under social pressure. Among lessons learned, fostering personal relationships was viewed as critical for success of this initiative. Conclusion: Arab and Israeli health professionals were prepared to get involved for two types of reasons: a) Project Level: opportunity to address a significant health issue (e.g. congenital hearing loss) while enhancing their professional careers, and b) Meta Level: concern about taking positive steps for building cooperation in the region. We invite discussion about roles that health professionals can play in building "cooperation networks" for underpinning health security, conflict resolution and global health promotion.

    Comparison of scores for bimodality of gene expression distributions and genome-wide evaluation of the prognostic relevance of high-scoring genes

    Background: A major goal of the analysis of high-dimensional RNA expression data from tumor tissue is to identify prognostic signatures for discriminating patient subgroups. For this purpose, genome-wide identification of bimodally expressed genes from gene array data is relevant because high and low expression groups are easier to distinguish than for genes with unimodal expression distributions. Recently, several methods for the identification of genes with bimodal distributions have been introduced. A straightforward approach is to cluster the expression values and score the distance between the two distributions. Other scores directly measure properties of the distribution. The kurtosis, for example, measures divergence from a normal distribution. An alternative is the outlier-sum statistic, which identifies genes with extremely high or low expression values in a subset of the samples. Results: We compare and discuss scores for bimodality for expression data. For the genome-wide identification of bimodal genes we apply all scores to expression data from 194 patients with node-negative breast cancer. Further, we present the first comprehensive genome-wide evaluation of the prognostic relevance of bimodal genes. We first rank genes according to bimodality scores and define two patient subgroups based on expression values. Then we assess the prognostic significance of the top-ranking bimodal genes by comparing the survival functions of the two patient subgroups. We also evaluate the global association between the bimodal shape of expression distributions and survival times with an enrichment-type analysis. Various cluster-based methods lead to a significant overrepresentation of prognostic genes. A striking result is obtained with the outlier-sum statistic (p < 10^-12). Many genes with heavy tails generate subgroups of patients with different prognosis. Conclusions: Genes with high bimodality scores are promising candidates for defining prognostic patient subgroups from expression data. We discuss advantages and disadvantages of the different scores for prognostic purposes. The outlier-sum statistic may be particularly valuable for the identification of genes to be included in prognostic signatures. Among the genes identified as bimodal in the breast cancer data set, several have not previously been recognized to be prognostic and bimodally expressed in breast cancer.
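One of the scores named in the bimodality abstract, the kurtosis, is easy to illustrate. The sketch below computes the population excess kurtosis (fourth standardized moment minus 3, so a normal distribution scores about 0) for a single gene's expression values. It is an illustration under stated assumptions rather than the paper's scoring code: the two expression vectors are invented, and the function name `excess_kurtosis` is hypothetical.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3 (about 0 for a normal distribution)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    if m2 == 0:
        raise ValueError("values have zero variance")
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / (m2 * m2) - 3.0


# Invented expression values: two equally sized groups (balanced bimodal gene).
balanced = [0.0] * 50 + [10.0] * 50
# Invented expression values: a small high-expression subgroup (heavy tail).
outlier_group = [0.0] * 95 + [10.0] * 5

score_balanced = excess_kurtosis(balanced)
score_outlier = excess_kurtosis(outlier_group)
```

A balanced two-group gene yields a strongly negative excess kurtosis, while a gene with a small outlying subgroup yields a large positive value, so the sign of the score hints at which kind of non-normal shape is present.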

    Discovering local patterns of co-evolution: computational aspects and biological examples

    Background: Co-evolution is the process in which two (or more) sets of orthologs exhibit a similar or correlative pattern of evolution. Co-evolution is a powerful way to learn about the functional interdependencies between sets of genes and cellular functions and to predict physical interactions. More generally, it can be used for answering fundamental questions about the evolution of biological systems. Orthologs that exhibit a strong signal of co-evolution in a certain part of the evolutionary tree may show a mild signal of co-evolution in other branches of the tree. The major reasons for this phenomenon are noise in the biological input, genes that gain or lose functions, and the fact that some measures of co-evolution relate to rare events such as positive selection. Previous publications in the field dealt with the problem of finding sets of genes that co-evolved along an entire underlying phylogenetic tree, without considering the fact that often co-evolution is local. Results: In this work, we describe a new set of biological problems that are related to finding patterns of local co-evolution. We discuss their computational complexity and design algorithms for solving them. These algorithms outperform other bi-clustering methods as they are designed specifically for solving the set of problems mentioned above. We use our approach to trace the co-evolution of fungal, eukaryotic, and mammalian genes at high resolution across the different parts of the corresponding phylogenetic trees. Specifically, we discover regions in the fungi tree that are enriched with positive evolution. We show that metabolic genes exhibit a remarkable level of co-evolution and different patterns of co-evolution in various biological datasets. In addition, we find that protein complexes that are related to gene expression exhibit non-homogenous levels of co-evolution across different parts of the fungi evolutionary line. In the case of mammalian evolution, signaling pathways that are related to neurotransmission exhibit a relatively higher level of co-evolution along the primate subtree. Conclusions: We show that finding local patterns of co-evolution is a computationally challenging task and we offer novel algorithms that allow us to solve this problem, thus opening a new approach for analyzing the evolution of biological systems.

    Differential Localization and Independent Acquisition of the H3K9me2 and H3K9me3 Chromatin Modifications in the Caenorhabditis elegans Adult Germ Line

    Histone methylation is a prominent feature of eukaryotic chromatin that modulates multiple aspects of chromosome function. Methyl modification can occur on several different amino acid residues and in distinct mono-, di-, and tri-methyl states. However, the interplay among these distinct modification states is not well understood. Here we investigate the relationships between dimethyl and trimethyl modifications on lysine 9 of histone H3 (H3K9me2 and H3K9me3) in the adult Caenorhabditis elegans germ line. Simultaneous immunofluorescence reveals very different temporal/spatial localization patterns for H3K9me2 and H3K9me3. While H3K9me2 is enriched on unpaired sex chromosomes and undergoes dynamic changes as germ cells progress through meiotic prophase, we demonstrate here that H3K9me3 is not enriched on unpaired sex chromosomes and localizes to all chromosomes in all germ cells in adult hermaphrodites and until the primary spermatocyte stage in males. Moreover, high-copy transgene arrays carrying somatic-cell specific promoters are highly enriched for H3K9me3 (but not H3K9me2) and correlate with DAPI-faint chromatin domains. We further demonstrate that the H3K9me2 and H3K9me3 marks are acquired independently. MET-2, a member of the SETDB histone methyltransferase (HMTase) family, is required for all detectable germline H3K9me2 but is dispensable for H3K9me3 in adult germ cells. Conversely, we show that the HMTase MES-2, an E(z) homolog responsible for H3K27 methylation in adult germ cells, is required for much of the germline H3K9me3 but is dispensable for H3K9me2. Phenotypic analysis of met-2 mutants indicates that MET-2 is nonessential for fertility but inhibits ectopic germ cell proliferation and contributes to the fidelity of chromosome inheritance. Our demonstration of the differential localization and independent acquisition of H3K9me2 and H3K9me3 implies that the trimethyl modification of H3K9 is not built upon the dimethyl modification in this context. Further, these and other data support a model in which these two modifications function independently in adult C. elegans germ cells.

    Identifying microRNA/mRNA dysregulations in ovarian cancer

    Background: MicroRNAs are a class of noncoding RNA molecules that co-regulate the expression of multiple genes via mRNA transcript degradation or translation inhibition. Since they often target entire pathways, they may be better drug targets than genes or proteins. MicroRNAs are known to be dysregulated in many tumours and associated with aggressive or poor-prognosis phenotypes. Since they regulate mRNA in a tissue-specific manner, their functional mRNA targets are poorly understood. In previous work, we developed a method to identify direct mRNA targets of microRNA from patient-matched microRNA/mRNA expression data using an anti-correlation signature. This method, applied to clear cell Renal Cell Carcinoma (ccRCC), revealed many new regulatory pathways compromised in ccRCC. In the present paper, we apply this method to identify dysregulated microRNA/mRNA mechanisms in ovarian cancer using data from The Cancer Genome Atlas (TCGA). Methods: TCGA microarray data were normalized, and samples whose class labels (tumour or normal) were ambiguous with respect to consensus ensemble K-Means clustering were removed. Significantly anti-correlated and correlated genes/microRNA differentially expressed between tumour and normal samples were identified. TargetScan was used to identify gene targets of microRNA. Results: We identified novel microRNA/mRNA mechanisms in ovarian cancer. For example, the expression level of RAD51AP1 was found to be strongly anti-correlated with the expression of hsa-miR-140-3p, which was significantly down-regulated in the tumour samples. The anti-correlation signature was present separately in the tumour and normal samples, suggesting a direct causal dysregulation of RAD51AP1 by hsa-miR-140-3p in the ovary. Other pairs of potential biological relevance include: hsa-miR-145/E2F3, hsa-miR-139-5p/TOP2A, and hsa-miR-133a/GCLC. We also identified sets of positively correlated microRNA/mRNA pairs that most likely result from indirect regulatory mechanisms. Conclusions: Our findings identify novel microRNA/mRNA relationships that can be verified experimentally. We identify both generic microRNA/mRNA regulation mechanisms in the ovary and specific microRNA/mRNA controls that are turned on or off in ovarian tumours. Our results suggest that the disease process uses specific mechanisms that may be useful as early detection biomarkers or in the development of microRNA therapies for treating ovarian cancers. The positively correlated microRNA/mRNA pairs suggest the existence of novel regulatory mechanisms that proceed via intermediate states (indirect regulation) in ovarian tumorigenesis.
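The anti-correlation signature mentioned in the ovarian cancer abstract can be illustrated with a plain Pearson correlation over patient-matched expression values. This sketch covers only that one step and makes several assumptions: the names `miR-x`, `GENE1`, and `GENE2` and all values are invented, the `-0.8` threshold is arbitrary, and the normalization, sample filtering, differential-expression, and TargetScan steps described in the abstract are not reproduced.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (norm_x * norm_y)


def anti_correlated_pairs(mirna, mrna, threshold=-0.8):
    """List (microRNA, gene, r) triples with r <= threshold, strongest first."""
    hits = []
    for mirna_name, mirna_values in mirna.items():
        for gene_name, gene_values in mrna.items():
            r = pearson(mirna_values, gene_values)
            if r <= threshold:
                hits.append((mirna_name, gene_name, r))
    return sorted(hits, key=lambda hit: hit[2])


# Invented, patient-matched expression values (one entry per sample).
mirna = {"miR-x": [1.0, 2.0, 3.0, 4.0, 5.0]}
mrna = {
    "GENE1": [10.0, 8.0, 6.0, 4.0, 2.0],  # falls as miR-x rises
    "GENE2": [1.0, 2.0, 3.0, 4.0, 5.0],   # rises with miR-x
}

hits = anti_correlated_pairs(mirna, mrna)
```

Here only the `miR-x`/`GENE1` pair passes the threshold. Running the same function separately on tumour and normal samples would mirror the abstract's check that the signature holds within each class.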

    Misty Mountain clustering: application to fast unsupervised flow cytometry gating

    Background: There are many important clustering questions in computational biology for which no satisfactory method exists. Automated clustering algorithms, when applied to large, multidimensional datasets such as flow cytometry data, prove unsatisfactory in terms of speed, problems with local minima or cluster shape bias. Model-based approaches are restricted by the assumptions of the fitting functions. Furthermore, model-based clustering requires serial clustering for all cluster numbers within a user-defined interval. The final cluster number is then selected by various criteria. These supervised serial clustering methods are time consuming, and different criteria frequently result in different optimal cluster numbers. Various unsupervised heuristic approaches that have been developed, such as affinity propagation, are too expensive to be applied to datasets on the order of 10^6 points that are often generated by high-throughput experiments. Results: To circumvent these limitations, we developed a new, unsupervised density contour clustering algorithm, called Misty Mountain, that is based on percolation theory and that efficiently analyzes large data sets. The approach can be envisioned as a progressive top-down removal of clouds covering a data histogram relief map to identify clusters by the appearance of statistically distinct peaks and ridges. This is a parallel clustering method that finds every cluster after analyzing the cross sections of the histogram only once. The overall run time for the composite steps of the algorithm increases linearly with the number of data points. The clustering of 10^6 data points in 2D data space takes place within about 15 seconds on a standard laptop PC. Comparison of the performance of this algorithm with other state-of-the-art automated flow cytometry gating methods indicates that Misty Mountain provides substantial improvements in both run time and the accuracy of cluster assignment. Conclusions: Misty Mountain is fast, unbiased for cluster shape, identifies stable clusters and is robust to noise. It provides a useful, general solution for multidimensional clustering problems. We demonstrate its suitability for automated gating of flow cytometry data.